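Optical flow computation with frame-based cameras provides high accuracy, but its speed is limited either by the model size of the algorithm or by the camera frame rate. This makes it inadequate for high-speed applications. Event cameras provide continuous, asynchronous event streams that overcome the frame-rate limitation. However, the algorithms for processing this data either borrow a frame-like setup, which limits speed, or suffer from lower accuracy. We fuse the complementary accuracy and speed advantages of the frame-based and event-based pipelines to provide high-speed optical flow while maintaining a low error rate. Our bio-mimetic network is validated on the MVSEC dataset, showing a 19% error degradation at a 4x speed-up. We then demonstrate the system in a high-speed drone flight scenario, in which the high-speed event camera computes the flow even before the optical camera sees the drone, making it suitable for applications such as tracking and segmentation. This work shows that the fundamental trade-offs in frame-based processing can be overcome by fusing data from other modalities.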
Translated by Google Translate
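Learning-based navigation systems are widely used in autonomous applications such as robotics, unmanned vehicles, and drones. Specialized hardware accelerators have been proposed to achieve high performance and energy efficiency for such navigation tasks. However, transient and permanent faults in hardware systems are increasing and can catastrophically violate task safety. Meanwhile, traditional redundancy-based protection methods are challenging to deploy in resource-constrained edge applications. In this paper, we experimentally evaluate the resilience of navigation systems with respect to algorithms, fault models, and data types, covering both RL training and inference. We further propose two efficient fault mitigation techniques that achieve a 2x improvement in success rate and a 39% improvement in quality of flight for learning-based navigation systems.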
Inspired by strategies like Active Learning, it is intuitive that intelligently selecting the training classes from a dataset for Zero-Shot Learning (ZSL) can improve the performance of existing ZSL methods. In this work, we propose a framework called Diverse and Rare Class Identifier (DiRaC-I) which, given an attribute-based dataset, can intelligently yield the most suitable "seen classes" for training ZSL models. DiRaC-I works in two stages: it first constructs a diversified set of seed classes, and then runs a visual-semantic mining algorithm, initialized with these seed classes, that acquires classes adequately capturing both diversity and rarity in the object domain. These classes can then be used as "seen classes" to train ZSL models for image classification. We adopt a real-world scenario in which novel object classes are available to neither DiRaC-I nor the ZSL models during training, and conduct extensive experiments on two benchmark datasets for zero-shot image classification, CUB and SUN. Our results demonstrate that DiRaC-I helps ZSL models achieve significant improvements in classification accuracy.
Modern telecom systems are monitored with performance and system logs from multiple application layers and components. Detecting anomalous events in these logs is key to identifying security breaches, resource over-utilization, critical/fatal errors, etc. Current supervised log anomaly detection frameworks tend to perform poorly on new types or signatures of anomalies that have few or no samples in the training data. In this work, we propose a meta-learning-based log anomaly detection framework (LogAnMeta) for detecting anomalies in sequences of log events with few samples. LogAnMeta trains a hybrid few-shot classifier in an episodic manner. The experimental results demonstrate the efficacy of our proposed method.
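To make "episodic" training concrete, the sketch below shows one common way to draw an N-way, K-shot episode from labeled data. It is an illustration only, not the LogAnMeta implementation: the function name `sample_episode`, its parameters, and the log labels are all hypothetical.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Return (support, query) lists of sample indices for one episode."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Only classes with enough samples can take part in an episode.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += picked[:k_shot]   # examples the classifier adapts on
        query += picked[k_shot:]     # examples the episode is scored on
    return support, query


# Hypothetical labels for log sequences (assumption, not from the paper).
labels = ["normal"] * 10 + ["auth_failure"] * 6 + ["disk_full"] * 6
support, query = sample_episode(labels, n_way=2, k_shot=3, n_query=2,
                                rng=random.Random(0))
```

Each training step would draw a fresh episode like this, so the classifier is repeatedly asked to separate classes from only a few examples, which matches the few-sample setting described above.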
Zero-shot detection (ZSD) is a challenging task where we aim to recognize and localize objects simultaneously, even when our model has not been trained with visual samples of a few target ("unseen") classes. Recently, methods employing generative models like GANs have shown some of the best results: a GAN trained on seen-class data generates unseen-class samples from their semantics, enabling vanilla object detectors to recognize unseen objects. However, the problem of semantic confusion remains, where the model is sometimes unable to distinguish between semantically similar classes. In this work, we propose to train a generative model incorporating a triplet loss that acknowledges the degree of dissimilarity between classes and reflects it in the generated samples. Moreover, a cyclic-consistency loss is enforced to ensure that the generated visual samples of a class correspond closely to their own semantics. Extensive experiments on two benchmark ZSD datasets, MSCOCO and PASCAL-VOC, demonstrate significant gains over current ZSD methods, reducing semantic confusion and improving detection for the unseen classes.
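For readers unfamiliar with the term, the following is a minimal sketch of a standard triplet (hinge) loss on feature vectors. It is not the paper's loss: the abstract says the loss accounts for the degree of dissimilarity between classes, and passing a class-dependent `margin` is only one assumed way to express that. All names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0,
               euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor, positive, negative = [0.0, 0.0], [3.0, 4.0], [6.0, 8.0]
# Assumption: a larger margin could be used for more dissimilar class pairs.
small_margin_loss = triplet_loss(anchor, positive, negative, margin=1.0)
large_margin_loss = triplet_loss(anchor, positive, negative, margin=7.0)
```

Here the positive lies at distance 5 and the negative at distance 10 from the anchor, so the loss vanishes for a margin of 1 and becomes positive only when the required separation exceeds the 5-unit gap.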
Large pretrained Transformer-based language models like BERT and GPT have changed the landscape of Natural Language Processing (NLP). However, fine-tuning such models still requires a large number of training examples for each target task, so annotating multiple datasets and training these models on various downstream tasks becomes time-consuming and expensive. In this work, we propose a simple extension of Prototypical Networks for few-shot text classification. Our main idea is to replace the class prototypes with Gaussians and introduce a regularization term that encourages the examples to be clustered near the appropriate class centroids. Experimental results show that our method outperforms various strong baselines on 13 public and 4 internal datasets. Furthermore, we use the class distributions as a tool for detecting potential out-of-distribution (OOD) data points during deployment.
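The idea of Gaussian class prototypes with OOD detection can be sketched as follows. This is an illustration under stated assumptions rather than the paper's method: it assumes diagonal covariances, uses a fixed log-likelihood threshold for OOD flagging, omits the regularization term, and works on made-up 2-D "embeddings". The names `fit_gaussian`, `log_likelihood`, `classify`, and the class labels are hypothetical.

```python
import math


def fit_gaussian(points, eps=1e-3):
    """Per-dimension mean and variance of one class's support embeddings."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # eps keeps the variance strictly positive for tiny support sets.
    var = [sum((p[d] - mean[d]) ** 2 for p in points) / n + eps
           for d in range(dim)]
    return mean, var


def log_likelihood(x, gaussian):
    """Log-density of x under a diagonal Gaussian (mean, var)."""
    mean, var = gaussian
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def classify(x, class_gaussians, ood_threshold):
    """Return (label or None, best score); None marks a likely OOD input."""
    scores = {c: log_likelihood(x, g) for c, g in class_gaussians.items()}
    best = max(scores, key=scores.get)
    label = best if scores[best] >= ood_threshold else None
    return label, scores[best]


# Toy support embeddings (assumption: real ones would come from an encoder).
support = {
    "billing": [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2]],
    "shipping": [[3.0, 3.0], [3.1, 2.8], [2.9, 3.2]],
}
gaussians = {c: fit_gaussian(pts) for c, pts in support.items()}
in_dist_label, _ = classify([0.1, 0.1], gaussians, ood_threshold=-50.0)
ood_label, _ = classify([10.0, -10.0], gaussians, ood_threshold=-50.0)
```

A point near a class centroid is assigned to that class, while a point far from every class distribution falls below the threshold and is flagged instead of being forced into the nearest class.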
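The security of deep learning (DL) systems is an extremely important research area, as these systems are being deployed in a growing number of applications thanks to their continual improvement on challenging tasks. Despite their overwhelming promise, deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye but can cause the model to misclassify. Protections against adversarial perturbations based on ensemble techniques have been shown either to be vulnerable to stronger adversaries or to lack end-to-end evaluation. In this paper, we attempt to develop a new ensemble-based solution that constructs defender models whose decision boundaries are diverse with respect to the original model. The ensemble of classifiers, constructed by (1) transforming the input with a method called Split-and-Shuffle and (2) restricting the significant features with a method called Contrast-Significant-Features, is shown to yield diverse gradients with respect to adversarial attacks, which reduces the chance of transferring adversarial examples from the original model to a defender model targeting the same class. We present extensive experiments using standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks to demonstrate the robustness of the proposed ensemble-based defense. We also evaluate the robustness in the presence of a stronger adversary that targets all models in the ensemble simultaneously. Results for the overall false positives and false negatives are provided to estimate the overall performance of the proposed method.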
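We present MDEAW, a multimodal database consisting of electrodermal activity (EDA) and photoplethysmography (PPG) signals recorded during exams for a course taught by a teacher at Eurecat Academy, Sabadell, Barcelona, in order to elicit the emotional reactions of students in a classroom scenario. Signals from 10 students were recorded, along with the students' self-assessment of their affective state after each stimulus in terms of 6 basic emotion states. All signals were captured using portable, wearable, wireless, low-cost, off-the-shelf equipment that has the potential to enable the use of affective computing methods in everyday applications. A baseline for student-wise affect recognition using EDA-based and PPG-based features, as well as their fusion, was established with ReMECS, Fed-ReMECS, and Fed-ReMECS-U. These results indicate the prospects of using low-cost devices for affective state recognition applications. The proposed database will be made publicly available to allow researchers to evaluate more thoroughly the suitability of these capturing devices for emotion state recognition applications.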
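The security of deep learning classifiers is a critical research area because of the existence of adversarial attacks. Such attacks usually rely on the principle of transferability, where an adversarial example crafted on a surrogate classifier tends to mislead the target classifier even if the two classifiers have quite different architectures. Ensemble methods against adversarial attacks show that an adversarial example is less likely to mislead multiple classifiers in an ensemble with diverse decision boundaries. However, recent ensemble methods have been shown either to be vulnerable to stronger adversaries or to lack end-to-end evaluation. This paper attempts to develop a new ensemble method that constructs multiple diverse classifiers using a Pairwise Adversarially Robust Loss (PARL) function during training. PARL uses the gradients of each layer with respect to the input in every classifier within the ensemble simultaneously. Compared with previous ensemble methods, the proposed training procedure enables PARL to achieve higher robustness against black-box transfer attacks without adversely affecting accuracy on clean examples. We also evaluate robustness in the presence of white-box attacks, where adversarial examples are crafted using the parameters of the target classifier. We present extensive experiments on standard image classification datasets, trained with the standard ResNet20 classifier, against state-of-the-art adversarial attacks to demonstrate the robustness of the proposed ensemble method.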
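Most popular goal-oriented dialogue agents are capable of understanding the conversational context. However, with the surge of virtual assistants with screens, next-generation agents are also required to understand the screen context in order to provide a proper interactive experience and better understand users' goals. In this paper, we propose a novel multimodal conversational framework in which the dialogue agent's next action and its arguments are derived jointly, conditioned on both the conversational and the visual context. Specifically, we propose a new model that can reason over the visual context within a conversation and populate API arguments with visual entities given the user query. Our model can recognize visual features such as color and shape, as well as metadata-based features such as the price or star rating associated with a visual entity. Because suitable multimodal conversational datasets are lacking, we also propose a novel multimodal dialog simulator to generate synthetic training data, and we collect realistic user data from MTurk to improve model robustness. The proposed model achieves a reasonable 85% model accuracy without high inference latency. We also demonstrate the proposed approach in a prototype furniture shopping experience for a multimodal virtual assistant.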